
Stable Video Diffusion


T2VSafetyBench: Evaluating the Safety of Text-to-Video Generative Models

Miao, Yibo, Zhu, Yifan, Dong, Yinpeng, Yu, Lijia, Zhu, Jun, Gao, Xiao-Shan

arXiv.org Artificial Intelligence

The recent development of Sora has ushered in a new era in text-to-video (T2V) generation, and with it rising concern about security risks. The generated videos may contain illegal or unethical content, and there is a lack of comprehensive quantitative understanding of their safety, posing a challenge to their reliability and practical deployment. Previous evaluations primarily focus on the quality of video generation. While some evaluations of text-to-image models have considered safety, they cover fewer aspects and do not address the unique temporal risks inherent in video generation. To bridge this research gap, we introduce T2VSafetyBench, a new benchmark designed for conducting safety-critical assessments of text-to-video models. We define 12 critical aspects of video generation safety and construct a malicious prompt dataset using LLMs and jailbreaking prompt attacks. Based on our evaluation results, we draw several important findings, including: 1) no single model excels in all aspects, with different models showing various strengths; 2) the correlation between GPT-4 assessments and manual reviews is generally high; 3) there is a trade-off between the usability and safety of text-to-video generative models. This indicates that as the field of video generation rapidly advances, safety risks are set to surge, highlighting the urgency of prioritizing video safety. We hope that T2VSafetyBench can provide insights for better understanding the safety of video generation in the era of generative AI.


Mods Are Asleep. Quick, Everyone Release AI Products

WIRED

The turmoil at OpenAI over the past five days has captivated the tech industry and kept entrepreneurs, journalists, and anyone who still has an X account glued to their timelines for the latest emoji updates and lower-case missives. In the meantime, some of the most prominent AI companies--including OpenAI--continued to do what Silicon Valley is known for: drop new products. The unexpected firing of Sam Altman, OpenAI's CEO, was followed by an avalanche of new AI features from competitors, including Anthropic and Stability AI. On Tuesday afternoon, in the midst of the turmoil, OpenAI rolled out ChatGPT with voice capabilities for free to all users. OpenAI had pre-released this feature in late September, but only for paid users.


The AI startup behind Stable Diffusion is now testing generative video

Engadget

Stable Diffusion's generative art can now be animated, developer Stability AI announced. The company has released a new product called Stable Video Diffusion as a research preview, allowing users to create video from a single image. "This state-of-the-art generative AI video model represents a significant step in our journey toward creating models for everyone of every type," the company wrote. The new tool has been released in the form of two image-to-video models, each capable of generating videos 14 to 25 frames long at 3 to 30 frames per second and 576×1024 resolution. "At the time of release in their foundational form, through external evaluation, we have found these models surpass the leading closed models in user preference studies," the company said, comparing it to the text-to-video platforms Runway and Pika Labs. Stable Video Diffusion is available only for research purposes at this point, not for real-world or commercial applications.